Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 78032 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 17.3 MiB |
| Average record size in memory | 232.0 B |
Variable types
| Numeric | 11 |
|---|---|
| Categorical | 9 |
| Boolean | 9 |
Alerts
| Alert | Type |
|---|---|
| Name has a high cardinality: 22710 distinct values | High cardinality |
| Address has a high cardinality: 6618 distinct values | High cardinality |
| StreetName has a high cardinality: 669 distinct values | High cardinality |
| Location has a high cardinality: 56 distinct values | High cardinality |
| NAICSDescr has a high cardinality: 1039 distinct values | High cardinality |
| RecordID is highly overall correlated with FID and 3 other fields | High correlation |
| X is highly overall correlated with PostalCode and 4 other fields | High correlation |
| Y is highly overall correlated with CENT_Y | High correlation |
| Ward is highly overall correlated with X and 5 other fields | High correlation |
| NAICSCode is highly overall correlated with Location and 1 other field | High correlation |
| CENT_X is highly overall correlated with X and 4 other fields | High correlation |
| CENT_Y is highly overall correlated with X and 5 other fields | High correlation |
| PostalCode is highly overall correlated with X and 5 other fields | High correlation |
| Location is highly overall correlated with RecordID and 9 other fields | High correlation |
| NAICSCat is highly overall correlated with Location and 1 other field | High correlation |
| Year is highly overall correlated with RecordID and 1 other field | High correlation |
| Age is highly overall correlated with RecordID and 1 other field | High correlation |
| FID is highly overall correlated with RecordID and 5 other fields | High correlation |
| BusinessID is highly overall correlated with FID and 1 other field | High correlation |
| Y is highly skewed (γ1 = -120.8370508) | Skewed |
| StreetNo is highly skewed (γ1 = 147.6524357) | Skewed |
| RecordID is uniformly distributed | Uniform |
| RecordID has unique values | Unique |
Reproduction
| Analysis started | 2023-04-01 18:21:00.957265 |
|---|---|
| Analysis finished | 2023-04-01 18:22:05.293776 |
| Duration | 1 minute and 4.34 seconds |
| Software version | pandas-profiling v3.5.0 |
RecordID
Real number (ℝ)
| Distinct | 78032 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39016.5 |
| Minimum | 1 |
|---|---|
| Maximum | 78032 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3902.55 |
| Q1 | 19508.75 |
| median | 39016.5 |
| Q3 | 58524.25 |
| 95-th percentile | 74130.45 |
| Maximum | 78032 |
| Range | 78031 |
| Interquartile range (IQR) | 39015.5 |
Descriptive statistics
| Standard deviation | 22526.042 |
|---|---|
| Coefficient of variation (CV) | 0.57734657 |
| Kurtosis | -1.2 |
| Mean | 39016.5 |
| Median Absolute Deviation (MAD) | 19508 |
| Skewness | 0 |
| Sum | 3.0445355 × 10⁹ |
| Variance | 5.0742259 × 10⁸ |
| Monotonicity | Strictly increasing |
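Because RecordID is strictly increasing with 78032 distinct values between 1 and 78032, it is simply the integers 1 through 78032, so the figures above can be checked in closed form. The following is a minimal sketch, not part of the original report. The function name `sequential_id_stats` is hypothetical, and the sketch assumes the report uses the sample variance (n − 1 denominator), which is the choice that reproduces the reported value. The kurtosis formula is the population excess kurtosis of a discrete uniform distribution, which only approximates a bias-corrected sample estimate.

```python
import math


def sequential_id_stats(n: int) -> dict:
    """Closed-form statistics for the integers 1..n, each occurring once."""
    total = n * (n + 1) // 2        # sum of 1..n
    mean = (n + 1) / 2
    variance = n * (n + 1) / 12     # sample variance (n - 1 denominator)
    std = math.sqrt(variance)
    # Population excess kurtosis of a discrete uniform distribution;
    # it approaches -1.2 as n grows.
    kurtosis = -6 * (n * n + 1) / (5 * (n * n - 1))
    return {
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,           # coefficient of variation
        "kurtosis": kurtosis,
    }


stats = sequential_id_stats(78032)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The coefficient of variation simplifies to sqrt(n / (3(n + 1))), which tends to 1/√3 ≈ 0.577 for large n, in line with the reported 0.57734657.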
Common Values
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 52020 | 1 | < 0.1% |
| 52027 | 1 | < 0.1% |
| 52026 | 1 | < 0.1% |
| 52025 | 1 | < 0.1% |
| 52024 | 1 | < 0.1% |
| 52023 | 1 | < 0.1% |
| 52022 | 1 | < 0.1% |
| 52021 | 1 | < 0.1% |
| 52019 | 1 | < 0.1% |
| Other values (78022) | 78022 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 78032 | 1 | |
| 78031 | 1 | |
| 78030 | 1 | |
| 78029 | 1 | |
| 78028 | 1 | |
| 78027 | 1 | |
| 78026 | 1 | |
| 78025 | 1 | |
| 78024 | 1 | |
| 78023 | 1 |
X
Real number (ℝ)
| Distinct | 4284 |
|---|---|
| Distinct (%) | 5.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -79.654547 |
| Minimum | -79.80298 |
|---|---|
| Maximum | -79.550935 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 78032 |
| Negative (%) | 100.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | -79.80298 |
|---|---|
| 5-th percentile | -79.743189 |
| Q1 | -79.68296 |
| median | -79.651649 |
| Q3 | -79.621073 |
| 95-th percentile | -79.578693 |
| Maximum | -79.550935 |
| Range | 0.25204547 |
| Interquartile range (IQR) | 0.061886626 |
Descriptive statistics
| Standard deviation | 0.047541739 |
|---|---|
| Coefficient of variation (CV) | -0.00059684903 |
| Kurtosis | -0.087208553 |
| Mean | -79.654547 |
| Median Absolute Deviation (MAD) | 0.031068498 |
| Skewness | -0.39700919 |
| Sum | -6215603.6 |
| Variance | 0.002260217 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| -79.6975994 | 1968 | 2.5% |
| -79.64275968 | 831 | 1.1% |
| -79.60364656 | 652 | 0.8% |
| -79.71222857 | 508 | 0.7% |
| -79.61962672 | 436 | 0.6% |
| -79.63864759 | 423 | 0.5% |
| -79.56936408 | 390 | 0.5% |
| -79.6136892 | 274 | 0.4% |
| -79.75938361 | 252 | 0.3% |
| -79.60455904 | 248 | 0.3% |
| Other values (4274) | 72050 |
Minimum 10 values
| Value | Count | Frequency (%) |
| -79.80298035 | 6 | < 0.1% |
| -79.8014612 | 5 | < 0.1% |
| -79.79447393 | 6 | < 0.1% |
| -79.79439767 | 3 | < 0.1% |
| -79.78884298 | 6 | < 0.1% |
| -79.78871792 | 137 | |
| -79.78850259 | 5 | < 0.1% |
| -79.78675536 | 55 | |
| -79.78630211 | 72 | |
| -79.78452433 | 64 |
Maximum 10 values
| Value | Count | Frequency (%) |
| -79.55093488 | 15 | |
| -79.55280776 | 8 | |
| -79.55341309 | 4 | < 0.1% |
| -79.55391093 | 6 | < 0.1% |
| -79.55445215 | 7 | |
| -79.55472553 | 9 | |
| -79.55507028 | 6 | < 0.1% |
| -79.55523334 | 5 | < 0.1% |
| -79.55532738 | 5 | < 0.1% |
| -79.55542565 | 4 | < 0.1% |
Y
Real number (ℝ)
| Distinct | 4284 |
|---|---|
| Distinct (%) | 5.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43.607593 |
| Minimum | 0 |
|---|---|
| Maximum | 43.732864 |
| Zeros | 5 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 43.517559 |
| Q1 | 43.576985 |
| median | 43.608267 |
| Q3 | 43.649308 |
| 95-th percentile | 43.698595 |
| Maximum | 43.732864 |
| Range | 43.732864 |
| Interquartile range (IQR) | 0.072322677 |
Descriptive statistics
| Standard deviation | 0.35296606 |
|---|---|
| Coefficient of variation (CV) | 0.0080941423 |
| Kurtosis | 14926.763 |
| Mean | 43.607593 |
| Median Absolute Deviation (MAD) | 0.036033607 |
| Skewness | -120.83705 |
| Sum | 3402787.7 |
| Variance | 0.12458504 |
| Monotonicity | Not monotonic |
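The skewness of about -120.8 and kurtosis of about 14927 appear alongside a minimum of 0 and five zero values, while the 5th to 95th percentiles span only 43.52 to 43.70. One plausible reading, which the report itself does not confirm, is that the five zeros are placeholder coordinates and that they alone drive the extreme skew. The sketch below illustrates the effect on synthetic data only; the values are invented for illustration, and `skewness` here is the plain population moment estimator rather than the bias-corrected one a profiling tool may use.

```python
def skewness(values):
    """Population (moment) skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic stand-in: 10,000 evenly spaced latitudes, then the same plus 5 zeros.
clean = [43.48 + 0.25 * i / 9999 for i in range(10_000)]
with_zeros = clean + [0.0] * 5

print("skewness without zeros:", skewness(clean))       # close to 0
print("skewness with 5 zeros: ", skewness(with_zeros))  # strongly negative
```

If the zeros do turn out to be placeholders, treating them as missing before profiling would give more representative statistics for Y.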
Common Values
| Value | Count | Frequency (%) |
| 43.51755854 | 1968 | 2.5% |
| 43.59351505 | 831 | 1.1% |
| 43.67999884 | 652 | 0.8% |
| 43.55837136 | 508 | 0.7% |
| 43.57693412 | 436 | 0.6% |
| 43.72011759 | 423 | 0.5% |
| 43.5935916 | 390 | 0.5% |
| 43.6325595 | 274 | 0.4% |
| 43.58207115 | 252 | 0.3% |
| 43.62508971 | 248 | 0.3% |
| Other values (4274) | 72050 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 43.48517014 | 10 | |
| 43.48968489 | 5 | |
| 43.4915708 | 5 | |
| 43.49199992 | 10 | |
| 43.49224252 | 3 | < 0.1% |
| 43.49454092 | 6 | |
| 43.49517064 | 4 | < 0.1% |
| 43.49608236 | 9 | |
| 43.49636475 | 5 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 43.73286372 | 38 | |
| 43.73233211 | 5 | < 0.1% |
| 43.73196635 | 6 | < 0.1% |
| 43.73068152 | 8 | < 0.1% |
| 43.72935757 | 6 | < 0.1% |
| 43.72770692 | 6 | < 0.1% |
| 43.72552272 | 11 | < 0.1% |
| 43.72537511 | 8 | < 0.1% |
| 43.7250583 | 5 | < 0.1% |
| 43.7248112 | 10 | < 0.1% |
FID
Real number (ℝ)
| Distinct | 16518 |
|---|---|
| Distinct (%) | 21.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7823.2043 |
| Minimum | 1 |
|---|---|
| Maximum | 16518 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 781 |
| Q1 | 3902 |
| median | 7804 |
| Q3 | 11705.25 |
| 95-th percentile | 14902 |
| Maximum | 16518 |
| Range | 16517 |
| Interquartile range (IQR) | 7803.25 |
Descriptive statistics
| Standard deviation | 4538.5029 |
|---|---|
| Coefficient of variation (CV) | 0.58013351 |
| Kurtosis | -1.1665353 |
| Mean | 7823.2043 |
| Median Absolute Deviation (MAD) | 3902 |
| Skewness | 0.024756244 |
| Sum | 6.1046028 × 10⁸ |
| Variance | 20598009 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 5 | < 0.1% |
| 9727 | 5 | < 0.1% |
| 9729 | 5 | < 0.1% |
| 9730 | 5 | < 0.1% |
| 9731 | 5 | < 0.1% |
| 9732 | 5 | < 0.1% |
| 9733 | 5 | < 0.1% |
| 9734 | 5 | < 0.1% |
| 9735 | 5 | < 0.1% |
| 9736 | 5 | < 0.1% |
| Other values (16508) | 77982 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 5 | |
| 2 | 5 | |
| 3 | 5 | |
| 4 | 5 | |
| 5 | 5 | |
| 6 | 5 | |
| 7 | 5 | |
| 8 | 5 | |
| 9 | 5 | |
| 10 | 5 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 16518 | 1 | |
| 16517 | 1 | |
| 16516 | 1 | |
| 16515 | 1 | |
| 16514 | 1 | |
| 16513 | 1 | |
| 16512 | 1 | |
| 16511 | 1 | |
| 16510 | 1 | |
| 16509 | 1 |
BusinessID
Real number (ℝ)
| Distinct | 21240 |
|---|---|
| Distinct (%) | 27.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34656.267 |
| Minimum | 2 |
|---|---|
| Maximum | 94424 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2230 |
| Q1 | 9764 |
| median | 19182.5 |
| Q3 | 55026 |
| 95-th percentile | 88915 |
| Maximum | 94424 |
| Range | 94422 |
| Interquartile range (IQR) | 45262 |
Descriptive statistics
| Standard deviation | 29857.312 |
|---|---|
| Coefficient of variation (CV) | 0.86152708 |
| Kurtosis | -0.99364033 |
| Mean | 34656.267 |
| Median Absolute Deviation (MAD) | 16019.5 |
| Skewness | 0.65057392 |
| Sum | 2.7042978 × 10⁹ |
| Variance | 8.9145909 × 10⁸ |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1055 | 5 | < 0.1% |
| 20882 | 5 | < 0.1% |
| 19580 | 5 | < 0.1% |
| 20871 | 5 | < 0.1% |
| 19831 | 5 | < 0.1% |
| 19332 | 5 | < 0.1% |
| 19583 | 5 | < 0.1% |
| 19832 | 5 | < 0.1% |
| 19584 | 5 | < 0.1% |
| 20872 | 5 | < 0.1% |
| Other values (21230) | 77982 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 2 | 2 | < 0.1% |
| 7 | 5 | |
| 10 | 5 | |
| 12 | 3 | |
| 16 | 5 | |
| 18 | 5 | |
| 20 | 5 | |
| 21 | 5 | |
| 23 | 5 | |
| 26 | 4 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 94424 | 1 | |
| 94423 | 1 | |
| 94419 | 1 | |
| 94371 | 1 | |
| 94321 | 1 | |
| 94319 | 1 | |
| 94318 | 1 | |
| 94317 | 1 | |
| 94313 | 1 | |
| 94293 | 1 |
Name
Categorical
| Distinct | 22710 |
|---|---|
| Distinct (%) | 29.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
Length
| Max length | 118 |
|---|---|
| Median length | 76 |
| Mean length | 22.654539 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1767779 |
|---|---|
| Distinct characters | 93 |
| Distinct categories | 15 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 3 ? |
Unique
| Unique | 5010 ? |
|---|---|
| Unique (%) | 6.4% |
Sample
| 1st row | Golf Trends Inc. |
|---|---|
| 2nd row | Apex Graphics Inc. |
| 3rd row | Sands, John & Associates Limited |
| 4th row | Printmedia-Tackaberry Times |
| 5th row | S W R Industries Ltd. |
Common Values
| Value | Count | Frequency (%) |
| Subway | 212 | 0.3% |
| Tim Hortons | 181 | 0.2% |
| Petro Canada | 123 | 0.2% |
| Shoppers Drug Mart | 102 | 0.1% |
| Tim Horton's | 97 | 0.1% |
| PLASP Child Care Centre | 96 | 0.1% |
| Dollarama | 92 | 0.1% |
| Starbucks | 88 | 0.1% |
| Shell Canada | 84 | 0.1% |
| Royal Bank of Canada | 78 | 0.1% |
| Other values (22700) | 76879 |
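"Tim Hortons" (181) and "Tim Horton's" (97) are counted as separate values above, which suggests that spelling variants inflate the distinct count for Name. A minimal normalization sketch follows, assuming the two spellings refer to the same business; the helper `normalize_name` is hypothetical and only handles case, apostrophes, and whitespace.

```python
import re
from collections import Counter


def normalize_name(name: str) -> str:
    """Lowercase, drop straight and curly apostrophes, collapse whitespace."""
    cleaned = name.lower().replace("'", "").replace("\u2019", "")
    return re.sub(r"\s+", " ", cleaned).strip()


# Counts taken from the Common Values table above.
raw_counts = Counter({"Subway": 212, "Tim Hortons": 181, "Tim Horton's": 97})

merged = Counter()
for name, count in raw_counts.items():
    merged[normalize_name(name)] += count

for name, count in merged.most_common():
    print(name, count)
```

Under that assumption the merged spelling would outrank Subway as the most common name. Fuzzier matching (abbreviations, legal suffixes such as "Inc.") would need a more careful approach than this.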
Words
| Value | Count | Frequency (%) |
| inc | 15794 | 5.7% |
| | 9127 | 3.3% |
| ltd | 7946 | 2.9% |
| canada | 4795 | 1.7% |
| centre | 2969 | 1.1% |
| and | 2598 | 0.9% |
| services | 2443 | 0.9% |
| the | 2359 | 0.8% |
| a | 2092 | 0.8% |
| of | 2044 | 0.7% |
| Other values (16113) | 225478 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 199927 | 11.3% |
| e | 132589 | 7.5% |
| a | 128136 | 7.2% |
| n | 115216 | 6.5% |
| i | 104250 | 5.9% |
| r | 101893 | 5.8% |
| o | 97613 | 5.5% |
| t | 94807 | 5.4% |
| s | 77470 | 4.4% |
| l | 62777 | 3.6% |
| Other values (83) | 653101 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1236769 | |
| Uppercase Letter | 275469 | 15.6% |
| Space Separator | 199927 | 11.3% |
| Other Punctuation | 44368 | 2.5% |
| Decimal Number | 4222 | 0.2% |
| Dash Punctuation | 4194 | 0.2% |
| Close Punctuation | 1272 | 0.1% |
| Open Punctuation | 1266 | 0.1% |
| Math Symbol | 178 | < 0.1% |
| Final Punctuation | 99 | < 0.1% |
| Other values (5) | 15 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 132589 | |
| a | 128136 | |
| n | 115216 | |
| i | 104250 | 8.4% |
| r | 101893 | 8.2% |
| o | 97613 | 7.9% |
| t | 94807 | 7.7% |
| s | 77470 | 6.3% |
| l | 62777 | 5.1% |
| c | 60202 | 4.9% |
| Other values (20) | 261816 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 35962 | |
| S | 28667 | 10.4% |
| I | 23883 | 8.7% |
| M | 18395 | 6.7% |
| L | 18128 | 6.6% |
| A | 17083 | 6.2% |
| P | 16975 | 6.2% |
| T | 15559 | 5.6% |
| D | 13515 | 4.9% |
| B | 11145 | 4.0% |
| Other values (17) | 76157 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 29521 | |
| & | 7166 | 16.2% |
| , | 3463 | 7.8% |
| ' | 3108 | 7.0% |
| / | 898 | 2.0% |
| : | 88 | 0.2% |
| # | 35 | 0.1% |
| @ | 29 | 0.1% |
| ! | 26 | 0.1% |
| " | 16 | < 0.1% |
| Other values (2) | 18 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 906 | |
| 2 | 760 | |
| 0 | 712 | |
| 4 | 418 | |
| 3 | 334 | 7.9% |
| 9 | 287 | 6.8% |
| 8 | 245 | 5.8% |
| 7 | 197 | 4.7% |
| 5 | 184 | 4.4% |
| 6 | 179 | 4.2% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 152 | |
| \| | 25 | 14.0% |
| > | 1 | 0.6% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1264 | |
| ] | 8 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 199927 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4194 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1266 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 99 |
Control
| Value | Count | Frequency (%) |
| 6 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Format
| Value | Count | Frequency (%) |
| | 3 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 2 |
Other Symbol
| Value | Count | Frequency (%) |
| © | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1512238 | |
| Common | 255541 | 14.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 132589 | 8.8% |
| a | 128136 | 8.5% |
| n | 115216 | 7.6% |
| i | 104250 | 6.9% |
| r | 101893 | 6.7% |
| o | 97613 | 6.5% |
| t | 94807 | 6.3% |
| s | 77470 | 5.1% |
| l | 62777 | 4.2% |
| c | 60202 | 4.0% |
| Other values (47) | 537285 |
Common
| Value | Count | Frequency (%) |
| (space) | 199927 | |
| . | 29521 | 11.6% |
| & | 7166 | 2.8% |
| - | 4194 | 1.6% |
| , | 3463 | 1.4% |
| ' | 3108 | 1.2% |
| ( | 1266 | 0.5% |
| ) | 1264 | 0.5% |
| 1 | 906 | 0.4% |
| / | 898 | 0.4% |
| Other values (26) | 3828 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1767601 | |
| Punctuation | 102 | < 0.1% |
| None | 76 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 199927 | 11.3% |
| e | 132589 | 7.5% |
| a | 128136 | 7.2% |
| n | 115216 | 6.5% |
| i | 104250 | 5.9% |
| r | 101893 | 5.8% |
| o | 97613 | 5.5% |
| t | 94807 | 5.4% |
| s | 77470 | 4.4% |
| l | 62777 | 3.6% |
| Other values (75) | 652923 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 99 | |
| | 3 | 2.9% |
None
| Value | Count | Frequency (%) |
| é | 67 | |
| ü | 4 | 5.3% |
| ē | 2 | 2.6% |
| É | 1 | 1.3% |
| ä | 1 | 1.3% |
| © | 1 | 1.3% |
Address
Categorical
| Distinct | 6618 |
|---|---|
| Distinct (%) | 8.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
Length
| Max length | 32 |
|---|---|
| Median length | 27 |
| Mean length | 16.625525 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1297323 |
|---|---|
| Distinct characters | 64 |
| Distinct categories | 7 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 292 ? |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | 300 Ambassador Dr |
|---|---|
| 2nd row | 320 Ambassador Dr |
| 3rd row | 320 Ambassador Dr |
| 4th row | 320 Ambassador Dr |
| 5th row | 321 Ambassador Dr |
Common Values
| Value | Count | Frequency (%) |
| 100 City Centre Dr | 953 | 1.2% |
| 5100 Erin Mills Pky | 523 | 0.7% |
| 7205 Goreway Dr | 483 | 0.6% |
| 1250 South Service Rd | 394 | 0.5% |
| 1550 South Gateway Rd | 284 | 0.4% |
| 4141 Dixie Rd | 248 | 0.3% |
| 2225 Erin Mills Pky | 238 | 0.3% |
| 50 Burnhamthorpe Rd W | 229 | 0.3% |
| 2355 Derry Rd E | 212 | 0.3% |
| 2000 Credit Valley Rd | 212 | 0.3% |
| Other values (6608) | 74256 |
Words
| Value | Count | Frequency (%) |
| rd | 28597 | 10.8% |
| dr | 17907 | 6.8% |
| e | 12047 | 4.6% |
| st | 9954 | 3.8% |
| blvd | 8013 | 3.0% |
| w | 7245 | 2.7% |
| dundas | 4805 | 1.8% |
| ave | 3977 | 1.5% |
| matheson | 2625 | 1.0% |
| pky | 2579 | 1.0% |
| Other values (3761) | 165836 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 185556 | 14.3% |
| r | 77071 | 5.9% |
| e | 71979 | 5.5% |
| a | 58783 | 4.5% |
| d | 55945 | 4.3% |
| 0 | 51078 | 3.9% |
| n | 49722 | 3.8% |
| 5 | 48031 | 3.7% |
| t | 47992 | 3.7% |
| i | 45039 | 3.5% |
| Other values (54) | 606127 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 636946 | |
| Decimal Number | 287140 | |
| Uppercase Letter | 187144 | 14.4% |
| Space Separator | 185556 | 14.3% |
| Dash Punctuation | 480 | < 0.1% |
| Other Punctuation | 54 | < 0.1% |
| Modifier Symbol | 3 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 77071 | |
| e | 71979 | |
| a | 58783 | |
| d | 55945 | |
| n | 49722 | 7.8% |
| t | 47992 | 7.5% |
| i | 45039 | 7.1% |
| o | 36413 | 5.7% |
| l | 32505 | 5.1% |
| s | 27700 | 4.3% |
| Other values (15) | 133797 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 31751 | |
| D | 29023 | |
| S | 18789 | |
| E | 16442 | |
| B | 14485 | |
| C | 13381 | |
| W | 11748 | 6.3% |
| M | 9512 | 5.1% |
| A | 9382 | 5.0% |
| T | 6499 | 3.5% |
| Other values (14) | 26132 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 51078 | |
| 5 | 48031 | |
| 1 | 41652 | |
| 2 | 31311 | |
| 3 | 25187 | |
| 6 | 23265 | |
| 7 | 20531 | |
| 4 | 17381 | 6.1% |
| 9 | 14549 | 5.1% |
| 8 | 14155 | 4.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 46 | |
| . | 8 | 14.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 185556 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 480 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 824090 | |
| Common | 473233 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 77071 | 9.4% |
| e | 71979 | 8.7% |
| a | 58783 | 7.1% |
| d | 55945 | 6.8% |
| n | 49722 | 6.0% |
| t | 47992 | 5.8% |
| i | 45039 | 5.5% |
| o | 36413 | 4.4% |
| l | 32505 | 3.9% |
| R | 31751 | 3.9% |
| Other values (39) | 316890 |
Common
| Value | Count | Frequency (%) |
| (space) | 185556 | |
| 0 | 51078 | 10.8% |
| 5 | 48031 | 10.1% |
| 1 | 41652 | 8.8% |
| 2 | 31311 | 6.6% |
| 3 | 25187 | 5.3% |
| 6 | 23265 | 4.9% |
| 7 | 20531 | 4.3% |
| 4 | 17381 | 3.7% |
| 9 | 14549 | 3.1% |
| Other values (5) | 14692 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1297323 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 185556 | 14.3% |
| r | 77071 | 5.9% |
| e | 71979 | 5.5% |
| a | 58783 | 4.5% |
| d | 55945 | 4.3% |
| 0 | 51078 | 3.9% |
| n | 49722 | 3.8% |
| 5 | 48031 | 3.7% |
| t | 47992 | 3.7% |
| i | 45039 | 3.5% |
| Other values (54) | 606127 |
StreetNo
Real number (ℝ)
| Distinct | 3090 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2946.1325 |
| Minimum | 1 |
|---|---|
| Maximum | 905629 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 57 |
| Q1 | 1050 |
| median | 2375 |
| Q3 | 5100 |
| 95-th percentile | 7070 |
| Maximum | 905629 |
| Range | 905628 |
| Interquartile range (IQR) | 4050 |
Descriptive statistics
| Standard deviation | 3997.6662 |
|---|---|
| Coefficient of variation (CV) | 1.35692 |
| Kurtosis | 33315.386 |
| Mean | 2946.1325 |
| Median Absolute Deviation (MAD) | 1655 |
| Skewness | 147.65244 |
| Sum | 2.2989261 × 10⁸ |
| Variance | 15981335 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 100 | 1101 | 1.4% |
| 5100 | 601 | 0.8% |
| 7205 | 520 | 0.7% |
| 1250 | 448 | 0.6% |
| 1 | 442 | 0.6% |
| 2000 | 383 | 0.5% |
| 1550 | 359 | 0.5% |
| 50 | 313 | 0.4% |
| 4141 | 310 | 0.4% |
| 2425 | 304 | 0.4% |
| Other values (3080) | 73251 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 442 | |
| 2 | 198 | |
| 3 | 200 | |
| 4 | 154 | 0.2% |
| 5 | 7 | < 0.1% |
| 6 | 33 | < 0.1% |
| 7 | 25 | < 0.1% |
| 8 | 21 | < 0.1% |
| 9 | 20 | < 0.1% |
| 10 | 154 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 905629 | 1 | < 0.1% |
| 7895 | 138 | |
| 7890 | 7 | < 0.1% |
| 7885 | 79 | |
| 7880 | 6 | < 0.1% |
| 7875 | 30 | < 0.1% |
| 7860 | 5 | < 0.1% |
| 7855 | 5 | < 0.1% |
| 7850 | 4 | < 0.1% |
| 7840 | 1 | < 0.1% |
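The single value 905629 sits far above the next-largest street number (7895) and likely accounts for most of the reported skewness of 147.65. One common way to flag such values, not something the report itself applies, is Tukey's fences built from the quartiles. The sketch below uses the reported Q1 = 1050 and Q3 = 5100; the function name `tukey_fences` and the conventional multiplier k = 1.5 are assumptions.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple:
    """Return (lower, upper) outlier fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for StreetNo above.
lower, upper = tukey_fences(1050, 5100)

# The three largest values from the table above.
candidates = [905629, 7895, 7890]
outliers = [v for v in candidates if v < lower or v > upper]

print("fences:", lower, upper)
print("flagged:", outliers)
```

Only 905629 falls outside the fences here. Whether it is a data-entry error or a genuine value cannot be determined from the report alone.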
StreetName
Categorical
| Distinct | 669 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
Length
| Max length | 26 |
|---|---|
| Median length | 22 |
| Mean length | 11.945035 |
| Min length | 3 |
Characters and Unicode
| Total characters | 932095 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 5 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 57 ? |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Ambassador Dr |
|---|---|
| 2nd row | Ambassador Dr |
| 3rd row | Ambassador Dr |
| 4th row | Ambassador Dr |
| 5th row | Ambassador Dr |
Common Values
| Value | Count | Frequency (%) |
| Dundas St E | 3202 | 4.1% |
| Matheson Blvd E | 2125 | 2.7% |
| Dixie Rd | 1982 | 2.5% |
| Hurontario St | 1971 | 2.5% |
| Lakeshore Rd E | 1628 | 2.1% |
| Dundas St W | 1586 | 2.0% |
| City Centre Dr | 1528 | 2.0% |
| Britannia Rd E | 1441 | 1.8% |
| Tomken Rd | 1416 | 1.8% |
| Argentia Rd | 1400 | 1.8% |
| Other values (659) | 59753 |
Words
| Value | Count | Frequency (%) |
| rd | 28598 | 15.4% |
| dr | 17906 | 9.7% |
| e | 12045 | 6.5% |
| st | 9954 | 5.4% |
| blvd | 8011 | 4.3% |
| w | 7247 | 3.9% |
| dundas | 4805 | 2.6% |
| ave | 3978 | 2.1% |
| matheson | 2625 | 1.4% |
| pky | 2575 | 1.4% |
| Other values (665) | 87802 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 107515 | 11.5% |
| r | 77031 | 8.3% |
| e | 71980 | 7.7% |
| a | 58785 | 6.3% |
| d | 55948 | 6.0% |
| n | 49725 | 5.3% |
| t | 47986 | 5.1% |
| i | 45031 | 4.8% |
| o | 36410 | 3.9% |
| l | 32503 | 3.5% |
| Other values (43) | 349181 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 636923 | |
| Uppercase Letter | 187126 | 20.1% |
| Space Separator | 107515 | 11.5% |
| Dash Punctuation | 480 | 0.1% |
| Other Punctuation | 51 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 77031 | |
| e | 71980 | |
| a | 58785 | |
| d | 55948 | |
| n | 49725 | 7.8% |
| t | 47986 | 7.5% |
| i | 45031 | 7.1% |
| o | 36410 | 5.7% |
| l | 32503 | 5.1% |
| s | 27702 | 4.3% |
| Other values (15) | 133822 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 31747 | |
| D | 29017 | |
| S | 18788 | |
| E | 16439 | |
| B | 14481 | |
| C | 13374 | |
| W | 11747 | 6.3% |
| M | 9514 | 5.1% |
| A | 9382 | 5.0% |
| T | 6500 | 3.5% |
| Other values (14) | 26137 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 45 | |
| . | 6 | 11.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 107515 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 480 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 824049 | |
| Common | 108046 | 11.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 77031 | 9.3% |
| e | 71980 | 8.7% |
| a | 58785 | 7.1% |
| d | 55948 | 6.8% |
| n | 49725 | 6.0% |
| t | 47986 | 5.8% |
| i | 45031 | 5.5% |
| o | 36410 | 4.4% |
| l | 32503 | 3.9% |
| R | 31747 | 3.9% |
| Other values (39) | 316903 |
Common
| Value | Count | Frequency (%) |
| (space) | 107515 | |
| - | 480 | 0.4% |
| ' | 45 | < 0.1% |
| . | 6 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 932095 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 107515 | 11.5% |
| r | 77031 | 8.3% |
| e | 71980 | 7.7% |
| a | 58785 | 6.3% |
| d | 55948 | 6.0% |
| n | 49725 | 5.3% |
| t | 47986 | 5.1% |
| i | 45031 | 4.8% |
| o | 36410 | 3.9% |
| l | 32503 | 3.5% |
| Other values (43) | 349181 |
BldgNo
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| Value | Count | Frequency (%) |
| False | 73798 | 94.6% |
| True | 4234 | 5.4% |
UnitNo
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| Value | Count | Frequency (%) |
| True | 53665 | 68.8% |
| False | 24367 | 31.2% |
PostalCode
Categorical
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 234096 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 3 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 4 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | L5T |
|---|---|
| 2nd row | L5T |
| 3rd row | L5T |
| 4th row | L5T |
| 5th row | L5T |
Common Values
| Value | Count | Frequency (%) |
| L4W | 12410 | |
| L5T | 8326 | 10.7% |
| L5N | 6083 | 7.8% |
| L4Z | 4952 | 6.3% |
| L5L | 4725 | 6.1% |
| L5B | 4593 | 5.9% |
| L5S | 4273 | 5.5% |
| L5M | 3805 | 4.9% |
| L4T | 3318 | 4.3% |
| L5A | 3293 | 4.2% |
| Other values (27) | 22254 |
Words
| Value | Count | Frequency (%) |
| l4w | 12410 | |
| l5t | 8326 | 10.7% |
| l5n | 6083 | 7.8% |
| l4z | 4952 | 6.3% |
| l5l | 4725 | 6.1% |
| l5b | 4593 | 5.9% |
| l5s | 4273 | 5.5% |
| l5m | 3805 | 4.9% |
| l4t | 3318 | 4.3% |
| l5a | 3293 | 4.2% |
| Other values (26) | 22254 |
Most occurring characters
| Value | Count | Frequency (%) |
| L | 82758 | |
| 5 | 49935 | |
| 4 | 28079 | 12.0% |
| W | 13005 | 5.6% |
| T | 11645 | 5.0% |
| N | 6083 | 2.6% |
| Z | 4952 | 2.1% |
| B | 4595 | 2.0% |
| S | 4273 | 1.8% |
| M | 3806 | 1.6% |
| Other values (17) | 24965 | 10.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 156054 | |
| Decimal Number | 78037 | |
| Lowercase Letter | 5 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 82758 | |
| W | 13005 | 8.3% |
| T | 11645 | 7.5% |
| N | 6083 | 3.9% |
| Z | 4952 | 3.2% |
| B | 4595 | 2.9% |
| S | 4273 | 2.7% |
| M | 3806 | 2.4% |
| A | 3293 | 2.1% |
| V | 3177 | 2.0% |
| Other values (11) | 18467 | 11.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 49935 | |
| 4 | 28079 | |
| 6 | 21 | < 0.1% |
| 8 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 5 |
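The character breakdown above shows five lowercase letters and a handful of unexpected digits (6, 7, 8) in a column that otherwise holds three-character codes starting with L4 or L5. A simple format check can surface such rows. The sketch below assumes the column is meant to hold Canadian forward sortation areas in the shape letter, digit, letter, all uppercase; the function name `is_valid_fsa` and the non-conforming sample codes are hypothetical, and the pattern is deliberately simplified (it does not exclude letters that are never used in real postal codes).

```python
import re

# Simplified shape check: uppercase letter, digit, uppercase letter.
FSA_PATTERN = re.compile(r"[A-Z]\d[A-Z]")


def is_valid_fsa(code: str) -> bool:
    """Return True if the whole string has the letter-digit-letter shape."""
    return FSA_PATTERN.fullmatch(code) is not None


# "L4W" and "L5T" appear in the report; the other two are made-up examples.
for code in ["L4W", "L5T", "l5c", "L55"]:
    print(code, is_valid_fsa(code))
```

Rows that fail the check could then be inspected by hand rather than corrected automatically, since the report does not show what the malformed values actually are.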
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 156059 | |
| Common | 78037 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| L | 82758 | |
| W | 13005 | 8.3% |
| T | 11645 | 7.5% |
| N | 6083 | 3.9% |
| Z | 4952 | 3.2% |
| B | 4595 | 2.9% |
| S | 4273 | 2.7% |
| M | 3806 | 2.4% |
| A | 3293 | 2.1% |
| V | 3177 | 2.0% |
| Other values (12) | 18472 | 11.8% |
Common
| Value | Count | Frequency (%) |
| 5 | 49935 | |
| 4 | 28079 | |
| 6 | 21 | < 0.1% |
| 8 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 234096 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| L | 82758 | |
| 5 | 49935 | |
| 4 | 28079 | 12.0% |
| W | 13005 | 5.6% |
| T | 11645 | 5.0% |
| N | 6083 | 2.6% |
| Z | 4952 | 2.1% |
| B | 4595 | 2.0% |
| S | 4273 | 1.8% |
| M | 3806 | 1.6% |
| Other values (17) | 24965 | 10.7% |
Location
Categorical
| Distinct | 56 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
Length
| Max length | 27 |
|---|---|
| Median length | 23 |
| Mean length | 16.691832 |
| Min length | 7 |
Characters and Unicode
| Total characters | 1302497 |
|---|---|
| Distinct characters | 43 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Gateway EA (East) |
|---|---|
| 2nd row | Gateway EA (East) |
| 3rd row | Gateway EA (East) |
| 4th row | Gateway EA (East) |
| 5th row | Gateway EA (East) |
Common Values
| Value | Count | Frequency (%) |
| Northeast EA (West) | 21104 | |
| Western Business Park EA | 5574 | 7.1% |
| Dixie EA | 4786 | 6.1% |
| Gateway EA (East) | 4760 | 6.1% |
| Meadowvale Business Park CC | 4458 | 5.7% |
| DT Core | 3192 | 4.1% |
| Airport CC | 2231 | 2.9% |
| DT Cooksville | 2065 | 2.6% |
| Northeast EA (East) | 1926 | 2.5% |
| Mavis-Erindale EA | 1822 | 2.3% |
| Other values (46) | 26114 |
Words
| Value | Count | Frequency (%) |
| ea | 42359 | |
| northeast | 23030 | 10.5% |
| west | 22942 | 10.5% |
| nhd | 14276 | 6.5% |
| park | 11037 | 5.1% |
| business | 10032 | 4.6% |
| east | 9149 | 4.2% |
| cc | 7899 | 3.6% |
| gateway | 6773 | 3.1% |
| dt | 6142 | 2.8% |
| Other values (45) | 64655 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 140262 | 10.8% |
| e | 118825 | 9.1% |
| t | 109250 | 8.4% |
| s | 103821 | 8.0% |
| a | 85081 | 6.5% |
| r | 68155 | 5.2% |
| o | 58559 | 4.5% |
| E | 56161 | 4.3% |
| A | 47775 | 3.7% |
| i | 47770 | 3.7% |
| Other values (33) | 466838 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 784844 | |
| Uppercase Letter | 312880 | 24.0% |
| Space Separator | 140262 | 10.8% |
| Close Punctuation | 30649 | 2.4% |
| Open Punctuation | 30649 | 2.4% |
| Dash Punctuation | 3213 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 118825 | |
| t | 109250 | |
| s | 103821 | |
| a | 85081 | |
| r | 68155 | |
| o | 58559 | |
| i | 47770 | |
| l | 32630 | 4.2% |
| n | 29533 | 3.8% |
| h | 27324 | 3.5% |
| Other values (11) | 103896 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 56161 | |
| A | 47775 | |
| N | 43931 | |
| C | 34625 | |
| W | 28516 | |
| D | 25204 | |
| H | 15708 | 5.0% |
| M | 13549 | 4.3% |
| P | 13477 | 4.3% |
| B | 10032 | 3.2% |
| Other values (8) | 23902 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 140262 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 30649 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 30649 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3213 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1097724 | |
| Common | 204773 | 15.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 118825 | 10.8% |
| t | 109250 | 10.0% |
| s | 103821 | 9.5% |
| a | 85081 | 7.8% |
| r | 68155 | 6.2% |
| o | 58559 | 5.3% |
| E | 56161 | 5.1% |
| A | 47775 | 4.4% |
| i | 47770 | 4.4% |
| N | 43931 | 4.0% |
| Other values (29) | 358396 |
Common
| Value | Count | Frequency (%) |
| (space) | 140262 | 68.5% |
| ) | 30649 | 15.0% |
| ( | 30649 | 15.0% |
| - | 3213 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1302497 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 140262 | 10.8% |
| e | 118825 | 9.1% |
| t | 109250 | 8.4% |
| s | 103821 | 8.0% |
| a | 85081 | 6.5% |
| r | 68155 | 5.2% |
| o | 58559 | 4.5% |
| E | 56161 | 4.3% |
| A | 47775 | 3.7% |
| i | 47770 | 3.7% |
| Other values (33) | 466838 |
Ward
Real number (ℝ)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.3913395 |
| Minimum | 1 |
|---|---|
| Maximum | 11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 11 |
| Maximum | 11 |
| Range | 10 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.4758594 |
|---|---|
| Coefficient of variation (CV) | 0.459229 |
| Kurtosis | 0.01057504 |
| Mean | 5.3913395 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.34308626 |
| Sum | 420697 |
| Variance | 6.12988 |
| Monotonicity | Not monotonic |
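Several of the descriptive statistics for Ward are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The check below is a small sketch using the figures reported above; the variable names are illustrative.

```python
# Figures copied from the Ward statistics and the dataset overview.
n_obs = 78032
total = 420697        # Sum
std = 2.4758594       # Standard deviation

mean = total / n_obs          # reported Mean: 5.3913395
cv = std / mean               # reported CV: 0.459229
variance = std ** 2           # reported Variance: 6.12988

print(round(mean, 7), round(cv, 6), round(variance, 5))
```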
| Value | Count | Frequency (%) |
| 5 | 33956 | 43.5% |
| 1 | 6772 | 8.7% |
| 8 | 6086 | 7.8% |
| 7 | 5561 | 7.1% |
| 3 | 5005 | 6.4% |
| 9 | 4687 | 6.0% |
| 11 | 4300 | 5.5% |
| 4 | 4163 | 5.3% |
| 6 | 3584 | 4.6% |
| 2 | 3163 | 4.1% |
| Value | Count | Frequency (%) |
| 1 | 6772 | 8.7% |
| 2 | 3163 | 4.1% |
| 3 | 5005 | 6.4% |
| 4 | 4163 | 5.3% |
| 5 | 33956 | 43.5% |
| 6 | 3584 | 4.6% |
| 7 | 5561 | 7.1% |
| 8 | 6086 | 7.8% |
| 9 | 4687 | 6.0% |
| 10 | 755 | 1.0% |
| Value | Count | Frequency (%) |
| 11 | 4300 | 5.5% |
| 10 | 755 | 1.0% |
| 9 | 4687 | 6.0% |
| 8 | 6086 | 7.8% |
| 7 | 5561 | 7.1% |
| 6 | 3584 | 4.6% |
| 5 | 33956 | 43.5% |
| 4 | 4163 | 5.3% |
| 3 | 5005 | 6.4% |
| 2 | 3163 | 4.1% |
NAICSCode
Real number (ℝ)
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 52.937603 |
| Minimum | 11 |
|---|---|
| Maximum | 91 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 11 |
|---|---|
| 5-th percentile | 31 |
| Q1 | 41 |
| median | 52 |
| Q3 | 62 |
| 95-th percentile | 81 |
| Maximum | 91 |
| Range | 80 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 15.992614 |
|---|---|
| Coefficient of variation (CV) | 0.3021031 |
| Kurtosis | -0.68005714 |
| Mean | 52.937603 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.30420754 |
| Sum | 4130827 |
| Variance | 255.7637 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 81 | 9052 | 11.6% |
| 44 | 9014 | 11.6% |
| 41 | 8749 | 11.2% |
| 54 | 7102 | 9.1% |
| 62 | 6459 | 8.3% |
| 72 | 6148 | 7.9% |
| 33 | 5710 | 7.3% |
| 61 | 3050 | 3.9% |
| 52 | 2995 | 3.8% |
| 48 | 2889 | 3.7% |
| Other values (14) | 16864 | 21.6% |
| Value | Count | Frequency (%) |
| 11 | 6 | < 0.1% |
| 21 | 15 | < 0.1% |
| 22 | 63 | 0.1% |
| 23 | 2783 | 3.6% |
| 31 | 1144 | 1.5% |
| 32 | 2828 | 3.6% |
| 33 | 5710 | 7.3% |
| 41 | 8749 | 11.2% |
| 44 | 9014 | 11.6% |
| 45 | 2057 | 2.6% |
| Value | Count | Frequency (%) |
| 91 | 460 | 0.6% |
| 81 | 9052 | 11.6% |
| 72 | 6148 | 7.9% |
| 71 | 1039 | 1.3% |
| 62 | 6459 | 8.3% |
| 61 | 3050 | 3.9% |
| 56 | 2607 | 3.3% |
| 55 | 504 | 0.6% |
| 54 | 7102 | 9.1% |
| 53 | 1838 | 2.4% |
NAICSCat
Categorical
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
| Retail Trade | 11071 |
|---|---|
| Manufacturing | 9682 |
| Other Services | 9053 |
| Wholesale Trade | 8749 |
| Professional, Scientific and Technical Services | 7102 |
| Other values (14) | 32375 |
Length
| Max length | 69 |
|---|---|
| Median length | 35 |
| Mean length | 23.769364 |
| Min length | 9 |
Characters and Unicode
| Total characters | 1854771 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Wholesale Trade |
|---|---|
| 2nd row | Manufacturing |
| 3rd row | Manufacturing |
| 4th row | Manufacturing |
| 5th row | Wholesale Trade |
Common Values
| Value | Count | Frequency (%) |
| Retail Trade | 11071 | 14.2% |
| Manufacturing | 9682 | 12.4% |
| Other Services | 9053 | 11.6% |
| Wholesale Trade | 8749 | 11.2% |
| Professional, Scientific and Technical Services | 7102 | 9.1% |
| Health Care and Social Assistance | 6459 | 8.3% |
| Accommodation and Food Services | 6148 | 7.9% |
| Transportation and Warehousing | 3789 | 4.9% |
| Educational Services | 3050 | 3.9% |
| Finance and Insurance | 2995 | 3.8% |
| Other values (9) | 9934 | 12.7% |
Words
| Value | Count | Frequency (%) |
| and | 37544 | |
| services | 27960 | 12.1% |
| trade | 19820 | 8.6% |
| retail | 11071 | 4.8% |
| manufacturing | 9682 | 4.2% |
| other | 9053 | 3.9% |
| wholesale | 8749 | 3.8% |
| professional | 7102 | 3.1% |
| scientific | 7102 | 3.1% |
| technical | 7102 | 3.1% |
| Other values (35) | 85934 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 194173 | 10.5% |
| a | 191887 | 10.3% |
| (space) | 153087 | 8.3% |
| n | 145859 | 7.9% |
| i | 140943 | 7.6% |
| r | 108943 | 5.9% |
| c | 104586 | 5.6% |
| t | 102451 | 5.5% |
| s | 96869 | 5.2% |
| o | 89096 | 4.8% |
| Other values (27) | 526877 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1497865 | |
| Uppercase Letter | 193071 | 10.4% |
| Space Separator | 153087 | 8.3% |
| Other Punctuation | 10748 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 194173 | |
| a | 191887 | |
| n | 145859 | |
| i | 140943 | |
| r | 108943 | |
| c | 104586 | |
| t | 102451 | |
| s | 96869 | 6.5% |
| o | 89096 | 5.9% |
| d | 79025 | 5.3% |
| Other values (10) | 244033 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 44128 | |
| T | 30711 | |
| R | 18391 | |
| A | 16713 | 8.7% |
| W | 15145 | 7.8% |
| M | 12793 | 6.6% |
| C | 10366 | 5.4% |
| F | 9143 | 4.7% |
| O | 9053 | 4.7% |
| P | 7583 | 3.9% |
| Other values (5) | 19045 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 153087 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 10748 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1690936 | |
| Common | 163835 | 8.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 194173 | |
| a | 191887 | |
| n | 145859 | 8.6% |
| i | 140943 | 8.3% |
| r | 108943 | 6.4% |
| c | 104586 | 6.2% |
| t | 102451 | 6.1% |
| s | 96869 | 5.7% |
| o | 89096 | 5.3% |
| d | 79025 | 4.7% |
| Other values (25) | 437104 |
Common
| Value | Count | Frequency (%) |
| (space) | 153087 | 93.4% |
| , | 10748 | 6.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1854771 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 194173 | 10.5% |
| a | 191887 | 10.3% |
| (space) | 153087 | 8.3% |
| n | 145859 | 7.9% |
| i | 140943 | 7.6% |
| r | 108943 | 5.9% |
| c | 104586 | 5.6% |
| t | 102451 | 5.5% |
| s | 96869 | 5.2% |
| o | 89096 | 4.8% |
| Other values (27) | 526877 |
NAICSDescr
Categorical
| Distinct | 1039 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
| Limited-service eating places | 3647 |
|---|---|
| General Automotive Repair | 1992 |
| Full-service restaurants | 1777 |
| Offices of Dentists | 1603 |
| Offices of Physicians | 1504 |
| Other values (1034) | 67509 |
Length
| Max length | 175 |
|---|---|
| Median length | 80 |
| Mean length | 35.436385 |
| Min length | 6 |
Characters and Unicode
| Total characters | 2765172 |
|---|---|
| Distinct characters | 61 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 124 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | Amusement and Sporting Goods Wholesaler-Distributors |
|---|---|
| 2nd row | Support Activities for Printing |
| 3rd row | Support Activities for Printing |
| 4th row | Other Printing |
| 5th row | Industrial Machinery, Equipment and Supplies Wholesaler-Distributors |
Common Values
| Value | Count | Frequency (%) |
| Limited-service eating places | 3647 | 4.7% |
| General Automotive Repair | 1992 | 2.6% |
| Full-service restaurants | 1777 | 2.3% |
| Offices of Dentists | 1603 | 2.1% |
| Offices of Physicians | 1504 | 1.9% |
| Offices of Lawyers | 1376 | 1.8% |
| Beauty Salons | 1302 | 1.7% |
| Other Freight Transportation Arrangement | 1255 | 1.6% |
| Elementary and Secondary Schools | 1240 | 1.6% |
| Religious Organizations | 1098 | 1.4% |
| Other values (1029) | 61238 | 78.5% |
Words
| Value | Count | Frequency (%) |
| and | 33347 | 10.0% |
| other | 18681 | 5.6% |
| stores | 9245 | 2.8% |
| offices | 8694 | 2.6% |
| of | 8405 | 2.5% |
| services | 8315 | 2.5% |
| all | 8273 | 2.5% |
| wholesaler-distributors | 7178 | 2.1% |
| manufacturing | 6730 | 2.0% |
| supplies | 4486 | 1.3% |
| Other values (1054) | 221747 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 278627 | 10.1% |
| (space) | 258164 | 9.3% |
| i | 198022 | 7.2% |
| r | 189307 | 6.8% |
| n | 183101 | 6.6% |
| t | 181749 | 6.6% |
| a | 181007 | 6.5% |
| s | 160174 | 5.8% |
| o | 139412 | 5.0% |
| l | 115516 | 4.2% |
| Other values (51) | 880093 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2193494 | |
| Uppercase Letter | 276079 | 10.0% |
| Space Separator | 258605 | 9.4% |
| Dash Punctuation | 17709 | 0.6% |
| Other Punctuation | 11390 | 0.4% |
| Open Punctuation | 4149 | 0.2% |
| Close Punctuation | 3340 | 0.1% |
| Control | 406 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 278627 | |
| i | 198022 | |
| r | 189307 | |
| n | 183101 | 8.3% |
| t | 181749 | 8.3% |
| a | 181007 | 8.3% |
| s | 160174 | 7.3% |
| o | 139412 | 6.4% |
| l | 115516 | 5.3% |
| c | 105666 | 4.8% |
| Other values (16) | 460913 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 38648 | |
| O | 30856 | |
| A | 24817 | 9.0% |
| C | 24436 | 8.9% |
| M | 21775 | 7.9% |
| P | 18986 | 6.9% |
| D | 14648 | 5.3% |
| W | 12588 | 4.6% |
| E | 11736 | 4.3% |
| F | 11266 | 4.1% |
| Other values (15) | 66323 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 9665 | |
| ' | 803 | 7.1% |
| & | 488 | 4.3% |
| . | 434 | 3.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 258164 | 99.8% |
| (non-ASCII space) | 441 | 0.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 17709 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 4149 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3340 |
Control
| Value | Count | Frequency (%) |
| (control character) | 406 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2469573 | |
| Common | 295599 | 10.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 278627 | 11.3% |
| i | 198022 | 8.0% |
| r | 189307 | 7.7% |
| n | 183101 | 7.4% |
| t | 181749 | 7.4% |
| a | 181007 | 7.3% |
| s | 160174 | 6.5% |
| o | 139412 | 5.6% |
| l | 115516 | 4.7% |
| c | 105666 | 4.3% |
| Other values (41) | 736992 |
Common
| Value | Count | Frequency (%) |
| (space) | 258164 | 87.3% |
| - | 17709 | 6.0% |
| , | 9665 | 3.3% |
| ( | 4149 | 1.4% |
| ) | 3340 | 1.1% |
| ' | 803 | 0.3% |
| & | 488 | 0.2% |
| (non-ASCII space) | 441 | 0.1% |
| . | 434 | 0.1% |
| (control character) | 406 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2764731 | |
| None | 441 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 278627 | 10.1% |
| (space) | 258164 | 9.3% |
| i | 198022 | 7.2% |
| r | 189307 | 6.8% |
| n | 183101 | 6.6% |
| t | 181749 | 6.6% |
| a | 181007 | 6.5% |
| s | 160174 | 5.8% |
| o | 139412 | 5.0% |
| l | 115516 | 4.2% |
| Other values (50) | 879652 |
None
| Value | Count | Frequency (%) |
| (non-ASCII space) | 441 | 100.0% |
Phone
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| True | 77399 |
|---|---|
| False | 633 |
| Value | Count | Frequency (%) |
| True | 77399 | 99.2% |
| False | 633 | 0.8% |
Fax
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| True | 50803 |
|---|---|
| False | 27229 |
| Value | Count | Frequency (%) |
| True | 50803 | 65.1% |
| False | 27229 | 34.9% |
TollFree
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| False | 66596 |
|---|---|
| True | 11436 |
| Value | Count | Frequency (%) |
| False | 66596 | 85.3% |
| True | 11436 | 14.7% |
EMail
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| True | 47406 |
|---|---|
| False | 30626 |
| Value | Count | Frequency (%) |
| True | 47406 | 60.8% |
| False | 30626 | 39.2% |
WebAddress
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| True | 56765 |
|---|---|
| False | 21267 |
| Value | Count | Frequency (%) |
| True | 56765 | 72.7% |
| False | 21267 | 27.3% |
EmplRange
Real number (ℝ)
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.1437743 |
| Minimum | 1 |
|---|---|
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.4384102 |
|---|---|
| Coefficient of variation (CV) | 0.67097088 |
| Kurtosis | 1.3215574 |
| Mean | 2.1437743 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.3064146 |
| Sum | 167283 |
| Variance | 2.0690238 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 37312 | 47.8% |
| 2 | 16050 | 20.6% |
| 3 | 10510 | 13.5% |
| 4 | 8120 | 10.4% |
| 5 | 3313 | 4.2% |
| 6 | 2149 | 2.8% |
| 7 | 318 | 0.4% |
| 8 | 164 | 0.2% |
| 9 | 96 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 37312 | 47.8% |
| 2 | 16050 | 20.6% |
| 3 | 10510 | 13.5% |
| 4 | 8120 | 10.4% |
| 5 | 3313 | 4.2% |
| 6 | 2149 | 2.8% |
| 7 | 318 | 0.4% |
| 8 | 164 | 0.2% |
| 9 | 96 | 0.1% |
| Value | Count | Frequency (%) |
| 9 | 96 | 0.1% |
| 8 | 164 | 0.2% |
| 7 | 318 | 0.4% |
| 6 | 2149 | 2.8% |
| 5 | 3313 | 4.2% |
| 4 | 8120 | 10.4% |
| 3 | 10510 | 13.5% |
| 2 | 16050 | 20.6% |
| 1 | 37312 | 47.8% |
CENT_X
Real number (ℝ)
| Distinct | 9085 |
|---|---|
| Distinct (%) | 11.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 608667.3 |
| Minimum | 596627.93 |
|---|---|
| Maximum | 617060.11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 596627.93 |
|---|---|
| 5-th percentile | 601477.61 |
| Q1 | 606588.08 |
| median | 609003.29 |
| Q3 | 611252.11 |
| 95-th percentile | 614719.98 |
| Maximum | 617060.11 |
| Range | 20432.171 |
| Interquartile range (IQR) | 4664.0241 |
Descriptive statistics
| Standard deviation | 3790.1103 |
|---|---|
| Coefficient of variation (CV) | 0.0062268999 |
| Kurtosis | 0.047766822 |
| Mean | 608667.3 |
| Median Absolute Deviation (MAD) | 2330.119 |
| Skewness | -0.44615725 |
| Sum | 4.7495526 × 10^10 |
| Variance | 14364936 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 609549.2555 | 1701 | 2.2% |
| 609556.5032 | 709 | 0.9% |
| 612552.1674 | 532 | 0.7% |
| 604009.418 | 436 | 0.6% |
| 609657.7584 | 354 | 0.5% |
| 615480.8966 | 322 | 0.4% |
| 611454.4028 | 277 | 0.4% |
| 611830.703 | 223 | 0.3% |
| 608539.0792 | 209 | 0.3% |
| 612581.1624 | 209 | 0.3% |
| Other values (9075) | 73060 | 93.6% |
| Value | Count | Frequency (%) |
| 596627.9342 | 4 | < 0.1% |
| 596636.3174 | 1 | < 0.1% |
| 596752.9696 | 4 | < 0.1% |
| 596761.7476 | 1 | < 0.1% |
| 597263.154 | 1 | < 0.1% |
| 597309.0542 | 6 | < 0.1% |
| 597312.632 | 3 | < 0.1% |
| 597730.9671 | 23 | < 0.1% |
| 597763.1149 | 2 | < 0.1% |
| 597772.3526 | 111 | 0.1% |
| Value | Count | Frequency (%) |
| 617060.1055 | 1 | < 0.1% |
| 616985.0552 | 16 | < 0.1% |
| 616918.4738 | 1 | < 0.1% |
| 616917.8604 | 4 | < 0.1% |
| 616879.86 | 3 | < 0.1% |
| 616839.6893 | 1 | < 0.1% |
| 616837.5953 | 1 | < 0.1% |
| 616836.9092 | 5 | < 0.1% |
| 616794.193 | 4 | < 0.1% |
| 616769.3441 | 1 | < 0.1% |
CENT_Y
Real number (ℝ)
| Distinct | 12366 |
|---|---|
| Distinct (%) | 15.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4829527.1 |
| Minimum | 4815546.6 |
|---|---|
| Maximum | 4843107.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 609.8 KiB |
Quantile statistics
| Minimum | 4815546.6 |
|---|---|
| 5-th percentile | 4819128.2 |
| Q1 | 4825873.6 |
| median | 4829298.9 |
| Q3 | 4833808.1 |
| 95-th percentile | 4839298.2 |
| Maximum | 4843107.8 |
| Range | 27561.199 |
| Interquartile range (IQR) | 7934.5206 |
Descriptive statistics
| Standard deviation | 5795.145 |
|---|---|
| Coefficient of variation (CV) | 0.0011999405 |
| Kurtosis | -0.63809248 |
| Mean | 4829527.1 |
| Median Absolute Deviation (MAD) | 3954.81 |
| Skewness | -0.052466864 |
| Sum | 3.7685766 × 10^11 |
| Variance | 33583706 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4819128.214 | 1701 | 2.2% |
| 4837278.362 | 532 | 0.7% |
| 4827620.949 | 520 | 0.7% |
| 4823628.592 | 323 | 0.4% |
| 4825810.215 | 277 | 0.4% |
| 4841687.188 | 244 | 0.3% |
| 4827728.859 | 231 | 0.3% |
| 4827535.97 | 201 | 0.3% |
| 4827620.949 | 189 | 0.2% |
| 4831996.045 | 179 | 0.2% |
| Other values (12356) | 73635 | 94.4% |
| Value | Count | Frequency (%) |
| 4815546.641 | 3 | < 0.1% |
| 4815549.405 | 1 | < 0.1% |
| 4815601.213 | 1 | < 0.1% |
| 4815609.051 | 3 | < 0.1% |
| 4815609.051 | 1 | < 0.1% |
| 4816100.511 | 1 | < 0.1% |
| 4816109.607 | 4 | < 0.1% |
| 4816303.869 | 1 | < 0.1% |
| 4816333.508 | 4 | < 0.1% |
| 4816361.694 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4843107.84 | 26 | < 0.1% |
| 4843107.84 | 10 | < 0.1% |
| 4843106.933 | 3 | < 0.1% |
| 4843045.912 | 1 | < 0.1% |
| 4843040.829 | 3 | < 0.1% |
| 4843040.829 | 1 | < 0.1% |
| 4842998.68 | 3 | < 0.1% |
| 4842998.68 | 1 | < 0.1% |
| 4842995.781 | 2 | < 0.1% |
| 4842855.077 | 1 | < 0.1% |
Year
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
| 2019 | 16518 |
|---|---|
| 2018 | 16350 |
| 2017 | 15737 |
| 2021 | 14825 |
| 2016 | 14602 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 312128 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2016 |
|---|---|
| 2nd row | 2016 |
| 3rd row | 2016 |
| 4th row | 2016 |
| 5th row | 2016 |
Common Values
| Value | Count | Frequency (%) |
| 2019 | 16518 | 21.2% |
| 2018 | 16350 | 21.0% |
| 2017 | 15737 | 20.2% |
| 2021 | 14825 | 19.0% |
| 2016 | 14602 | 18.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 92857 | |
| 0 | 78032 | |
| 1 | 78032 | |
| 9 | 16518 | 5.3% |
| 8 | 16350 | 5.2% |
| 7 | 15737 | 5.0% |
| 6 | 14602 | 4.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 312128 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 92857 | |
| 0 | 78032 | |
| 1 | 78032 | |
| 9 | 16518 | 5.3% |
| 8 | 16350 | 5.2% |
| 7 | 15737 | 5.0% |
| 6 | 14602 | 4.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 312128 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 92857 | |
| 0 | 78032 | |
| 1 | 78032 | |
| 9 | 16518 | 5.3% |
| 8 | 16350 | 5.2% |
| 7 | 15737 | 5.0% |
| 6 | 14602 | 4.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 312128 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 92857 | |
| 0 | 78032 | |
| 1 | 78032 | |
| 9 | 16518 | 5.3% |
| 8 | 16350 | 5.2% |
| 7 | 15737 | 5.0% |
| 6 | 14602 | 4.7% |
Age
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.8 KiB |
| 1 | 21240 |
|---|---|
| 2 | 18801 |
| 3 | 15727 |
| 4 | 12761 |
| 5 | 9503 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 78032 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 21240 | 27.2% |
| 2 | 18801 | 24.1% |
| 3 | 15727 | 20.2% |
| 4 | 12761 | 16.4% |
| 5 | 9503 | 12.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 21240 | |
| 2 | 18801 | |
| 3 | 15727 | |
| 4 | 12761 | |
| 5 | 9503 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 78032 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 21240 | |
| 2 | 18801 | |
| 3 | 15727 | |
| 4 | 12761 | |
| 5 | 9503 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 78032 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 21240 | |
| 2 | 18801 | |
| 3 | 15727 | |
| 4 | 12761 | |
| 5 | 9503 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 78032 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 21240 | |
| 2 | 18801 | |
| 3 | 15727 | |
| 4 | 12761 | |
| 5 | 9503 |
isnew
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| False | 71148 |
|---|---|
| True | 6884 |
| Value | Count | Frequency (%) |
| False | 71148 | 91.2% |
| True | 6884 | 8.8% |
Closed
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.3 KiB |
| False | 71617 |
|---|---|
| True | 6415 |
| Value | Count | Frequency (%) |
| False | 71617 | 91.8% |
| True | 6415 | 8.2% |
Auto
The auto setting is an interpretable pairwise column metric that uses the following mapping:
- Variable_type-Variable_type : Method, Range
- Categorical-Categorical : Cramer's V, [0,1]
- Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
- Numerical-Numerical : Spearman's ρ, [-1,1]
This configuration uses the recommended metric for each pair of columns.
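For the categorical pairs in the mapping above, a bias-corrected Cramér's V can be sketched in plain Python as follows. This is an illustration of the Bergsma-style correction under stated assumptions, not the profiler's own implementation; `cramers_v_corrected` is a hypothetical helper.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    if n < 2:
        return 0.0
    joint = Counter(zip(x, y))
    row, col = Counter(x), Counter(y)
    r, k = len(row), len(col)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(phi2_corr / denom)


same = ["a", "b"] * 50
v_perfect = cramers_v_corrected(same, same)
v_indep = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["c", "d", "c", "d"] * 25)
print(v_perfect, v_indep)
```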
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
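The calculation described above can be sketched as follows: rank both variables (ties share their average rank), then divide the covariance of the ranks by the product of their standard deviations. The helper names are hypothetical, and the sketch assumes neither input is constant.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def covariance_ratio(a, b):
    """Covariance of a and b divided by the product of their standard deviations."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(x, y):
    return covariance_ratio(average_ranks(x), average_ranks(y))


# A nonlinear but monotonic relationship still gives rho close to 1.
rho_cubic = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(rho_cubic)
```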
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
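A minimal sketch of this definition, together with a check of the location and scale invariance mentioned above, might look like this. `pearson_r` is a hypothetical helper that assumes both inputs have non-zero variance, and the data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 9.0]

r_original = pearson_r(x, y)
# Shifting and positively rescaling x leaves r unchanged.
r_rescaled = pearson_r([10 * a - 4 for a in x], y)
# An exact linear function of x has r = 1 regardless of its slope.
r_linear = pearson_r(x, [2 * a + 3 for a in x])
print(r_original, r_rescaled, r_linear)
```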
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
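The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements it directly; tied pairs count as neither concordant nor discordant, which is an assumption of this illustration (statistical libraries often default to a tie-adjusted variant instead). `kendall_tau_a` is a hypothetical helper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


tau_same = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau_same, tau_reversed, tau_mixed)
```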
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that was proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
| | RecordID | X | Y | FID | BusinessID | Name | Address | StreetNo | StreetName | BldgNo | UnitNo | PostalCode | Location | Ward | NAICSCode | NAICSCat | NAICSDescr | Phone | Fax | TollFree | EMail | WebAddress | EmplRange | CENT_X | CENT_Y | Year | Age | isnew | Closed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | -79.689829 | 43.644181 | 1 | 1055 | Golf Trends Inc. | 300 Ambassador Dr | 300 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 41 | Wholesale Trade | Amusement and Sporting Goods Wholesaler-Distributors | Yes | Yes | Yes | Yes | Yes | 3 | 605668.2538 | 4.833187e+06 | 2016 | 1 | No | No |
| 1 | 2 | -79.689419 | 43.644988 | 2 | 1057 | Apex Graphics Inc. | 320 Ambassador Dr | 320 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 32 | Manufacturing | Support Activities for Printing | Yes | Yes | No | Yes | Yes | 4 | 605699.9370 | 4.833277e+06 | 2016 | 1 | No | No |
| 2 | 3 | -79.689419 | 43.644988 | 3 | 1058 | Sands, John & Associates Limited | 320 Ambassador Dr | 320 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 32 | Manufacturing | Support Activities for Printing | Yes | Yes | No | No | No | 5 | 605699.9370 | 4.833277e+06 | 2016 | 1 | No | No |
| 3 | 4 | -79.689419 | 43.644988 | 4 | 1060 | Printmedia-Tackaberry Times | 320 Ambassador Dr | 320 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 32 | Manufacturing | Other Printing | Yes | Yes | No | Yes | Yes | 1 | 605699.9370 | 4.833277e+06 | 2016 | 1 | No | No |
| 4 | 5 | -79.690664 | 43.645493 | 5 | 1061 | S W R Industries Ltd. | 321 Ambassador Dr | 321 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 41 | Wholesale Trade | Industrial Machinery, Equipment and Supplies Wholesaler-Distributors | Yes | Yes | No | Yes | Yes | 2 | 605598.6442 | 4.833332e+06 | 2016 | 1 | No | No |
| 5 | 6 | -79.690277 | 43.646372 | 6 | 1063 | Crossdock Freight Solutions | 361 Ambassador Dr | 361 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 48 | Transportation and Warehousing | Other Freight Transportation Arrangement | Yes | Yes | No | Yes | Yes | 4 | 605628.2838 | 4.833430e+06 | 2016 | 1 | No | No |
| 6 | 7 | -79.689877 | 43.646914 | 7 | 1065 | Green Belting Industries Ltd. | 381 Ambassador Dr | 381 | Ambassador Dr | No | No | L5T | Gateway EA (East) | 5 | 32 | Manufacturing | Paint and Coating Manufacturing | Yes | Yes | Yes | Yes | Yes | 5 | 605659.5646 | 4.833490e+06 | 2016 | 1 | No | No |
| 7 | 8 | -79.634279 | 43.640404 | 8 | 1073 | Dafco Filtration Group Corporation | 5390 Ambler Dr | 5390 | Ambler Dr | No | Yes | L4W | Northeast EA (West) | 5 | 33 | Manufacturing | Industrial and Commercial Fan and Blower and Air Purification Equipment Manufacturing | Yes | Yes | No | Yes | Yes | 5 | 610155.4182 | 4.832840e+06 | 2016 | 1 | No | No |
| 8 | 9 | -79.632844 | 43.641337 | 9 | 1074 | Ace Trans Inc. | 5391 Ambler Dr | 5391 | Ambler Dr | No | Yes | L4W | Northeast EA (West) | 5 | 49 | Transportation and Warehousing | General Warehousing and Storage | Yes | Yes | No | Yes | Yes | 1 | 610269.4640 | 4.832945e+06 | 2016 | 1 | No | No |
| 9 | 10 | -79.637815 | 43.642638 | 10 | 1077 | Petro Maxx | 5510 Ambler Dr | 5510 | Ambler Dr | No | Yes | L4W | Northeast EA (West) | 5 | 54 | Professional, Scientific and Technical Services | Other Specialized Design Services | Yes | No | No | Yes | Yes | 4 | 609866.1452 | 4.833083e+06 | 2016 | 1 | No | No |
| | RecordID | X | Y | FID | BusinessID | Name | Address | StreetNo | StreetName | BldgNo | UnitNo | PostalCode | Location | Ward | NAICSCode | NAICSCat | NAICSDescr | Phone | Fax | TollFree | EMail | WebAddress | EmplRange | CENT_X | CENT_Y | Year | Age | isnew | Closed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 78022 | 78023 | -79.652774 | 43.709466 | 14816 | 57550 | Advance Car & Truck Rental | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (West) | 5 | 53 | Real Estate and Rental and Leasing | Passenger Car Rental | Yes | Yes | Yes | Yes | Yes | 1 | 608544.3664 | 4.840490e+06 | 2021 | 5 | No | No |
| 78023 | 78024 | -79.652774 | 43.709466 | 14817 | 57551 | Video Palace | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (West) | 5 | 53 | Real Estate and Rental and Leasing | All Other Consumer Goods Rental | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 3 | No | No |
| 78024 | 78025 | -79.652774 | 43.709466 | 14818 | 57552 | Secure Life Insurance Agency Inc. | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (West) | 5 | 52 | Finance and Insurance | Direct Group Life, Health and Medical Insurance Carriers | Yes | Yes | Yes | No | Yes | 1 | 608544.3664 | 4.840490e+06 | 2021 | 5 | No | No |
| 78025 | 78026 | -79.652774 | 43.709466 | 14819 | 57555 | Skillman Flooring | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (West) | 5 | 44 | Retail Trade | Floor Covering Stores | Yes | Yes | No | Yes | Yes | 1 | 608544.3664 | 4.840490e+06 | 2021 | 5 | No | No |
| 78026 | 78027 | -79.652774 | 43.709466 | 14820 | 57557 | Verma Vastar Manufacturing Inc. | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (West) | 5 | 31 | Manufacturing | Cut and Sew Clothing Contracting | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 4 | No | No |
| 78027 | 78028 | -79.652774 | 0.000000 | 14821 | 60142 | JobsForU | 2960 Drew Rd | 2960 | Drew Rd | No | Yes | L4T | Northeast EA (East) | 5 | 56 | Administrative and Support, Waste Management and Remediation Services | Employment Placement Agencies and Executive Search Services | Yes | No | No | Yes | Yes | 3 | 608544.3664 | 4.840490e+06 | 2021 | 1 | Yes | No |
| 78028 | 78029 | -79.652774 | 0.000000 | 14822 | 60159 | Elite Source Solutions | 2980 Drew Rd | 2980 | Drew Rd | No | Yes | L4T | Northeast EA (East) | 5 | 56 | Administrative and Support, Waste Management and Remediation Services | Employment Placement Agencies and Executive Search Services | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 1 | Yes | No |
| 78029 | 78030 | -79.652774 | 0.000000 | 14823 | 60160 | Indian Sweet Master | 2980 Drew Rd | 2980 | Drew Rd | No | Yes | L4T | Northeast EA (East) | 5 | 72 | Accommodation and Food Services | Full-service restaurants | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 1 | Yes | No |
| 78030 | 78031 | -79.652774 | 0.000000 | 14824 | 60161 | Mississauga Flooring & Supplies Inc. | 2980 Drew Rd | 2980 | Drew Rd | No | Yes | L4T | Northeast EA (East) | 5 | 41 | Wholesale Trade | Floor Covering Wholesaler-Distributors | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 1 | Yes | No |
| 78031 | 78032 | -79.652774 | 0.000000 | 14825 | 60162 | Punjabi Textile Ltd. | 2980 Drew Rd | 2980 | Drew Rd | No | Yes | L4T | Northeast EA (East) | 5 | 41 | Wholesale Trade | Clothing and Clothing Accessories Wholesaler-Distributors | Yes | No | No | No | No | 1 | 608544.3664 | 4.840490e+06 | 2021 | 1 | Yes | No |